In this Take-home Exercise, I will explore to reveal the social areas of the city of Engagement, Ohio USA. I will also visualise and analyse locations with traffice bottleneck of the city.
Challenge 2: Patterns of Life considers the patterns of daily life throughout the city. You will describe the daily routines for some representative people, characterize the travel patterns to identify potential bottlenecks or hazards, and examine how these patterns change over time and seasons.
In Challenge 2, you will use visual analytic techniques to address these questions:
Assuming the volunteers are representative of the city’s population, characterize the distinct areas of the city that you identify. For each area you identify, provide your rationale and supporting data. Limit your response to 10 images and 500 words.
Where are the busiest areas in Engagement? Are there traffic bottlenecks that should be addressed? Explain your rationale. Limit your response to 10 images and 500 words.
In this take-home exercise, appropriate visual analytics methods are used to reveal the social areas as well as the locations with traffic bottleneck of the city of Engagement, Ohio USA while addressing the questions stated in the Task section.
The data are processed by using appropriate tidyverse family of packages.
The picture below shows a sketch of the initial design proposed.
Before we get started, it is important for us to ensure that the required R packages have been installed. If yes, we will load the R packages. If they have yet to be installed, we will install the R packages and load them onto R environment.
The chunk code below will do the trick.
packages = c('sf', 'tmap', 'tidyverse', 'lubridate', 'clock', 'sftime', 'rmarkdown')
for(p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p, character.only = T)
}
In the code chunk below, read_sf()
of sf
package is used to parse School.csv, Pubs.csv,
Apartments.csv, Buildings.csv, Employer.csv,
and Restaurants.csv from the data folder into R as sf
data.frames. The 6 files parsed represented the various social areas we
would like to explore.
schools <- read_sf("data/wkt/Schools.csv",
options = "GEOM_POSSIBLE_NAMES=location")
pubs <- read_sf("data/wkt/Pubs.csv",
options = "GEOM_POSSIBLE_NAMES=location")
apartments <- read_sf("data/wkt/Apartments.csv",
options = "GEOM_POSSIBLE_NAMES=location")
buildings <- read_sf("data/wkt/Buildings.csv",
options = "GEOM_POSSIBLE_NAMES=location")
employers <- read_sf("data/wkt/Employers.csv",
options = "GEOM_POSSIBLE_NAMES=location")
restaurants <- read_sf("data/wkt/Restaurants.csv",
options = "GEOM_POSSIBLE_NAMES=location")
print(buildings)
Simple feature collection with 1042 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -4762.191 ymin: -30.08359 xmax: 2650 ymax: 7850.037
CRS: NA
# A tibble: 1,042 x 5
buildingId location buildingType maxOccupancy
<chr> <POLYGON> <chr> <chr>
1 1 ((350.0639 4595.666, 390.0633~ Commercial ""
2 2 ((-1926.973 2725.611, -1948.1~ Residental "12"
3 3 ((685.6846 1552.131, 645.9985~ Commercial ""
4 4 ((-976.7845 4542.382, -1053.2~ Commercial ""
5 5 ((1259.306 3572.727, 1299.255~ Residental "2"
6 6 ((478.8969 1082.484, 473.6596~ Commercial ""
7 7 ((-1920.823 615.7447, -1960.8~ Residental ""
8 8 ((-3302.657 5394.354, -3301.5~ Commercial ""
9 9 ((-600.5789 4429.228, -495.95~ Commercial ""
10 10 ((-68.75908 5379.924, -28.782~ Residental "5"
# ... with 1,032 more rows, and 1 more variable: units <chr>
print(restaurants)
Simple feature collection with 20 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -4131.414 ymin: 1194.129 xmax: 1407.711 ymax: 7236.526
CRS: NA
# A tibble: 20 x 5
restaurantId foodCost maxOccupancy location
<chr> <chr> <chr> <POINT>
1 445 5.15 71 (631.5131 2001.477)
2 446 4.17 82 (413.84 1194.129)
3 447 5.87 119 (497.9968 1624.515)
4 448 4.07 98 (698.2411 4392.417)
5 449 5.11 53 (1407.711 4010.457)
6 895 4.44 103 (-1623.074 3872.03)
7 896 5.86 107 (-2126.172 4285.182)
8 897 5.76 104 (-1989.635 3368.765)
9 898 4.22 115 (-1771.452 4343.58)
10 899 5.65 85 (-820.929 4497.243)
11 1345 4.7 72 (-1176.801 5846.647)
12 1346 5.85 74 (-3529.558 4809.641)
13 1347 4.41 87 (-1221.405 4941.003)
14 1348 4.23 57 (-508.1269 6307.584)
15 1349 4.77 104 (-1176.699 4726.118)
16 1801 5.51 64 (-3136.128 6671.917)
17 1802 5.08 72 (-4002.934 6671.997)
18 1803 4.35 94 (-3182.886 7167.609)
19 1804 5.92 48 (-4131.414 7236.526)
20 1805 5.59 85 (-3927.91 5494.64)
# ... with 1 more variable: buildingId <chr>
The code chunk below plots the building polygon features by using tm_polygon()
of tmap.
tmap_mode("view")
tm_shape(buildings)+
tm_polygons(col = "grey60",
size = 1,
border.col = "black",
border.lwd = 1)
tmap_mode("plot")
The code chunk below is used to plot a composite map by combining the simple feature data.frames of the buildings and the various social areas.
tmap_mode("view") +
tm_shape(buildings)+
tm_polygons(col = "grey60",
size = 1,
border.col = "black",
border.lwd = 1) +
tm_shape(employers) +
tm_dots(col = "red") +
tm_shape(apartments) +
tm_dots(col = "lightblue") +
tm_shape(pubs) +
tm_dots(col = "green") +
tm_shape(restaurants) +
tm_dots(col = "blue") +
tm_shape(schools) +
tm_dots(col = "yellow")
tmap_mode("plot")
We arrive at the following points regarding social areas of the city:
Central Region: Central business district of the city, with greatest number of companies/restaurants/pubs located in the center of this region. The inner region is surrounded by residential apartments.
North-West Region: Mature estate with densely populated apartments as well as its own list of companies. Reasonably numbers of restaurants, pubs and schools are also seen in this region.
South Region & East Region: Non-mature estates with relatively less apartments and companies. The number of restaurants, pubs and schools are proportionally less too.
In the code chunk below, read_sf()
of sf
package is used to parse ParticipantStatusLogs1.csv from the
rawdata folder into R as a simple feature data.frame.
logs <- read_sf("rawdata/ActivityLogs/ParticipantStatusLogs1.csv",
options = "GEOM_POSSIBLE_NAMES=currentLocation")
logs_selected <- logs %>%
mutate(Timestamp = date_time_parse(timestamp,
zone = "",
format = "%Y-%m-%dT%H:%M:%S")) %>%
mutate(day = get_day(Timestamp)) %>%
filter(currentMode == "Transport")
Data frame logs_selected is saved in RDS format to avoid uploading large files to Git.
saveRDS(logs_selected, 'data/logs_selected')
logs_selected <- readRDS('data/logs_selected')
In the code chunk below, st_make_grid() of
sf package is used to create hexagons.
hex <- st_make_grid(buildings,
cellsize=100,
square=FALSE) %>%
st_sf() %>%
rowid_to_column('hex_id')
plot(hex)
The code chunk below perform point in polygon overlay by using
st_join() of sf package.
points_in_hex <- st_join(logs_selected,
hex,
join=st_within)
plot(points_in_hex, pch='.')
In the code chunk below, st_join() of
sf package is used to count the number of event points
in the hexagons.
points_in_hex <- st_join(logs_selected,
hex,
join=st_within) %>%
st_set_geometry(NULL) %>%
count(name='pointCount', hex_id)
head(points_in_hex)
# A tibble: 6 x 2
hex_id pointCount
<int> <int>
1 169 35
2 212 56
3 225 21
4 226 94
5 227 22
6 228 45
In the code chunk below, left_join() of
dplyr package is used to perform a left-join by using
hex as the target table and points_in_hex as the join
table. The join ID is hex_id.
In the code chunk below, tmap package is used to create the hexagon binning map.
tm_shape(hex_combined %>%
filter(pointCount > 0))+
tm_fill("pointCount",
style = "headtails",
title="Point Count") +
tm_borders(alpha = 0.1) +
tm_layout(main.title = "The City of Engagement",
main.title.position = "center",
main.title.color = "purple")
The darker-colored points are the busiest areas in Engagement.
The possible traffic bottlenecks (as seen in the figure below) are:
The path joining Central Region to North-West Region
The path joining South Region to East Region
We also notice that these 2 paths are indeed the only paths possible connecting the regions together.